https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2018.01169.x
http://www.youtube.com/watch?v=XcBLEVknqvY
https://www.rstudio.com/products/rstudio/download/
https://moderndive.com/2-getting-started.html
https://cran.r-project.org/web/packages/addinslist/README.html
https://rstudio.github.io/rstudioaddins/
# devtools::install_github("rstudio/addinexamples", type = "source")
In R there are many ways to do the same thing
R Syntax Comparison::CHEAT SHEET
https://www.amelia.mn/Syntax-cheatsheet.pdf
#RStats — There are always several ways to do the same thing… nice example on with the identity matrix by @TeaStats https://t.co/O3GXdPiM32
— Colin Fay 🤘 (@_ColinFay) April 1, 2019
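A minimal sketch of the point above, assuming only base R: three different ways to build the same 3 x 3 identity matrix. The object names (`id_a`, `id_b`, `id_c`) are illustrative and are not taken from the linked tweet.

```r
# Three ways to build the same 3 x 3 identity matrix in base R
id_a <- diag(3)                          # diag() with a single integer
id_b <- diag(1, nrow = 3, ncol = 3)      # explicit value and dimensions
id_c <- matrix(0, nrow = 3, ncol = 3)    # start from a matrix of zeros ...
diag(id_c) <- 1                          # ... and fill the diagonal with ones

# Check that all three objects hold the same values
all.equal(id_a, id_b)
all.equal(id_a, id_c)
```

Which form to prefer is mostly a matter of readability; the cheat sheet linked above compares the same kind of alternatives for everyday data tasks.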
I love the #rstats community.
— Frank Elavsky ᴰᵃᵗᵃ ᵂᶦᶻᵃʳᵈ (@Frankly_Data) July 3, 2018
Someone is like, “oh hey peeps, I saw a big need for this mundane but difficult task that I infrequently do, so I created a package that will literally scrape the last bits of peanut butter out of the jar for you. It's called pbplyr.”
What a tribe.
https://blog.mitchelloharawild.com/blog/user-2018-feature-wall/
Available CRAN Packages By Name
https://cran.r-project.org/web/packages/available_packages_by_name.html
CRAN Task Views
https://cran.r-project.org/web/views/
Bioconductor
https://www.bioconductor.org
RecommendR
http://recommendr.info/
pkgsearch
CRAN package search
https://github.com/metacran/pkgsearch
CRANsearcher
https://github.com/RhoInc/CRANsearcher
Awesome R
https://awesome-r.com/
install.packages("tidyverse", dependencies = TRUE)
install.packages("jmv", dependencies = TRUE)
install.packages("questionr", dependencies = TRUE)
install.packages("Rcmdr", dependencies = TRUE)
install.packages("summarytools")
# install.packages("tidyverse", dependencies = TRUE)
# install.packages("jmv", dependencies = TRUE)
# install.packages("questionr", dependencies = TRUE)
# install.packages("Rcmdr", dependencies = TRUE)
# install.packages("summarytools")
# require(tidyverse)
# require(jmv)
# require(questionr)
# library(summarytools)
# library(gganimate)
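Installing a package (once) and loading it (every session) are separate steps. The sketch below shows one possible way to install only the packages that are missing; the vector name `wanted` is an assumption for illustration, and the install line is left commented so nothing is installed by accident.

```r
# Packages named in the notes above; edit the vector to suit your own work
wanted <- c("tidyverse", "jmv", "questionr", "summarytools")

# requireNamespace() returns TRUE when a package is installed, FALSE otherwise
# (note: it also loads the namespace of every package it finds)
is_installed <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)
to_install <- wanted[!is_installed]

# Install only what is missing (uncomment to actually run the installation)
# if (length(to_install) > 0) install.packages(to_install, dependencies = TRUE)
to_install
```

After installation, `library(pkgname)` attaches a package for the current session.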
# ?mean
# ??efetch
# help(merge)
# example(merge)
RDocumentation https://www.rdocumentation.org
R Package Documentation https://rdrr.io/
GitHub
Stackoverflow
How I use #rstats
— Emily Bovee (@ebovee09) August 10, 2018
h/t @ThePracticalDev pic.twitter.com/erRnTG0Ujr
Adding [R] to a search query can also help. http://cran.r-project.org/doc/contrib/Baggott-refcard-v2.pdf
https://www.rstudio.com/resources/cheatsheets/
https://github.com/qinwf/awesome-R#readme
https://twitter.com/hashtag/rstats?src=hash
Got a question to ask on @SlackHQ or post on @github? No time to read the long post on how to use reprex? Here is a 20-second gif for you to format your R codes nicely and for others to reproduce your problem. (An example from a talk given by @JennyBryan) #rstat pic.twitter.com/gpuGXpFIsX
— ZhiYang (@zhiiiyang) October 18, 2018
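The reprex package mentioned in the tweet automates this, but the core idea can be sketched with base R alone: share a small piece of data as code so that others can rebuild it exactly. The object names below are illustrative assumptions.

```r
# Take a small slice of a built-in dataset to use in a question
small <- head(iris[, c("Sepal.Length", "Species")], 3)

# dput(small) prints R code that recreates the object;
# deparse() returns the same code as text so it can be pasted into a post
txt <- paste(deparse(small), collapse = "\n")
cat(txt, "\n")

# Anyone can rebuild the data by running that code
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

Pasting the `dput()` output together with the failing code gives readers everything they need to reproduce the problem.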
https://support.rstudio.com/hc/en-us/articles/218611977-Importing-Data-with-RStudio
Spreadsheet users using #rstats: where's the data?#rstats users using spreadsheets: where's the code?
— Leonard Kiefer (@lenkiefer) July 7, 2018
# library(nycflights13)
# summary(flights)
View(data)
data
head
tail
glimpse
str
skimr::skim()
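A short sketch of the base R functions listed above, using the built-in `iris` data as a stand-in for your own data frame. `View()`, `glimpse()` and `skimr::skim()` are left out here because they need RStudio or extra packages.

```r
data(iris)       # built-in example data, used here as a stand-in

dim(iris)        # number of rows and columns
head(iris, 3)    # first three rows
tail(iris, 3)    # last three rows
str(iris)        # compact structure: type and first values of each column
```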
The questionr package will be used for recoding
https://juba.github.io/questionr/articles/recoding_addins.html
summary()
mean
median
min
max
sd
table()
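The functions listed above can be tried on one numeric and one categorical column. This is a minimal sketch with the built-in `iris` data; the variable name `x` is only for illustration.

```r
x <- iris$Sepal.Length

summary(x)                     # min, quartiles, median, mean, max in one call

# The same statistics computed one by one
c(mean = mean(x), median = median(x), min = min(x), max = max(x), sd = sd(x))

table(iris$Species)            # counts per category
```

If a column contains missing values, most of these functions return `NA` unless `na.rm = TRUE` is supplied.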
library(readr)
# irisdata <- read_csv("data/iris.csv")
# jmv::descriptives(
# data = irisdata,
# vars = "Sepal.Length",
# splitBy = "Species",
# freq = TRUE,
# hist = TRUE,
# dens = TRUE,
# bar = TRUE,
# box = TRUE,
# violin = TRUE,
# dot = TRUE,
# mode = TRUE,
# sum = TRUE,
# sd = TRUE,
# variance = TRUE,
# range = TRUE,
# se = TRUE,
# skew = TRUE,
# kurt = TRUE,
# quart = TRUE,
# pcEqGr = TRUE)
# install.packages("scatr")
# scatr::scat(
# data = irisdata,
# x = "Sepal.Length",
# y = "Sepal.Width",
# group = "Species",
# marg = "dens",
# line = "linear",
# se = TRUE)
https://cran.r-project.org/web/packages/summarytools/vignettes/Introduction.html
library(summarytools)
Registered S3 method overwritten by 'pryr':
method from
print.bytes Rcpp
summarytools::freq(iris$Species, style = "rmarkdown")
Type: Factor
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |
summarytools::freq(iris$Species, report.nas = FALSE, style = "rmarkdown", headings = FALSE)
| | Freq | % | % Cum. |
|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 |
| Total | 150 | 100.00 | 100.00 |
with(tobacco, print(ctable(smoker, diseased), method = 'render'))
| smoker / diseased | Yes | No | Total |
|---|---|---|---|
| Yes | 125 (41.9%) | 173 (58.1%) | 298 (100.0%) |
| No | 99 (14.1%) | 603 (85.9%) | 702 (100.0%) |
| Total | 224 (22.4%) | 776 (77.6%) | 1000 (100.0%) |
Generated by summarytools 0.9.3 (R version 3.6.0)
2019-06-30
with(tobacco,
print(ctable(smoker, diseased, prop = 'n', totals = FALSE),
headings = FALSE, method = "render"))
| smoker / diseased | Yes | No |
|---|---|---|
| Yes | 125 | 173 |
| No | 99 | 603 |
summarytools::descr(iris, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
N: 150
| | Petal.Length | Petal.Width | Sepal.Length | Sepal.Width |
|---|---|---|---|---|
| Mean | 3.76 | 1.20 | 5.84 | 3.06 |
| Std.Dev | 1.77 | 0.76 | 0.83 | 0.44 |
| Min | 1.00 | 0.10 | 4.30 | 2.00 |
| Q1 | 1.60 | 0.30 | 5.10 | 2.80 |
| Median | 4.35 | 1.30 | 5.80 | 3.00 |
| Q3 | 5.10 | 1.80 | 6.40 | 3.30 |
| Max | 6.90 | 2.50 | 7.90 | 4.40 |
| MAD | 1.85 | 1.04 | 1.04 | 0.44 |
| IQR | 3.50 | 1.50 | 1.30 | 0.50 |
| CV | 0.47 | 0.64 | 0.14 | 0.14 |
| Skewness | -0.27 | -0.10 | 0.31 | 0.31 |
| SE.Skewness | 0.20 | 0.20 | 0.20 | 0.20 |
| Kurtosis | -1.42 | -1.36 | -0.61 | 0.14 |
| N.Valid | 150.00 | 150.00 | 150.00 | 150.00 |
| Pct.Valid | 100.00 | 100.00 | 100.00 | 100.00 |
descr(iris, stats = c("mean", "sd", "min", "med", "max"), transpose = TRUE,
headings = FALSE, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 3.76 | 1.77 | 1.00 | 4.35 | 6.90 |
| Petal.Width | 1.20 | 0.76 | 0.10 | 1.30 | 2.50 |
| Sepal.Length | 5.84 | 0.83 | 4.30 | 5.80 | 7.90 |
| Sepal.Width | 3.06 | 0.44 | 2.00 | 3.00 | 4.40 |
# view(dfSummary(iris))
dfSummary(tobacco, plain.ascii = FALSE, style = "grid")
text graphs are displayed; set 'tmp.img.dir' parameter to activate png graphs
Dimensions: 1000 x 9
Duplicates: 2
| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F; 2. M | 489 (50.0%); 489 (50.0%) | IIIIIIIIII IIIIIIIIII | 978 (97.8%) | 22 (2.2%) |
| 2 | age [numeric] | Mean (sd): 49.6 (18.3); min < med < max: 18 < 50 < 80; IQR (CV): 32 (0.4) | 63 distinct values | [text histogram] | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34; 2. 35-50; 3. 51-70; 4. 71 + | 258 (26.5%); 241 (24.7%); 317 (32.5%); 159 (16.3%) | IIIII IIII IIIIII III | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | Mean (sd): 25.7 (4.5); min < med < max: 8.8 < 25.6 < 39.4; IQR (CV): 5.7 (0.2) | 974 distinct values | [text histogram] | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes; 2. No | 298 (29.8%); 702 (70.2%) | IIIII IIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | Mean (sd): 6.8 (11.9); min < med < max: 0 < 0 < 40; IQR (CV): 11 (1.8) | 37 distinct values | [text histogram] | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes; 2. No | 224 (22.4%); 776 (77.6%) | IIII IIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension; 2. Cancer; 3. Cholesterol; 4. Heart; 5. Pulmonary; 6. Musculoskeletal; 7. Diabetes; 8. Hearing; 9. Digestive; 10. Hypotension; [ 3 others ] | 36 (16.2%); 34 (15.3%); 21 (9.5%); 20 (9.0%); 20 (9.0%); 19 (8.6%); 14 (6.3%); 14 (6.3%); 12 (5.4%); 11 (5.0%); 21 (9.5%) | III III I I I I I I I I | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | Mean (sd): 1 (0.1); min < med < max: 0.9 < 1 < 1.1; IQR (CV): 0.2 (0.1) | 0.86!: 267 (26.7%); 1.04!: 249 (24.9%); 1.05!: 324 (32.4%); 1.06!: 160 (16.0%); ! rounded | IIIII IIII IIIIII III | 1000 (100%) | 0 (0%) |
# First save the results
iris_stats_by_species <- by(data = iris,
INDICES = iris$Species,
FUN = descr, stats = c("mean", "sd", "min", "med", "max"),
transpose = TRUE)
# Then use view(), like so:
view(iris_stats_by_species, method = "pander", style = "rmarkdown")
Non-numerical variable(s) ignored: Species
Group: Species = setosa
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 1.46 | 0.17 | 1.00 | 1.50 | 1.90 |
| Petal.Width | 0.25 | 0.11 | 0.10 | 0.20 | 0.60 |
| Sepal.Length | 5.01 | 0.35 | 4.30 | 5.00 | 5.80 |
| Sepal.Width | 3.43 | 0.38 | 2.30 | 3.40 | 4.40 |
Group: Species = versicolor
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |
Group: Species = virginica
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |
# view(iris_stats_by_species)
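For comparison, a minimal base R sketch of the same split-by-group idea, without summarytools. It only computes group means, so it is a simpler stand-in for `by()` plus `descr()`, not a replacement; the name `group_means` is illustrative.

```r
# Mean sepal length within each species
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)
round(group_means, 2)

# Several variables at once, returned as a data frame
aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species, data = iris, FUN = mean)
```

The rounded means agree with the Sepal.Length rows of the per-species tables above (5.01, 5.94 and 6.59).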
data(tobacco) # tobacco is an example dataframe included in the package
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown")
Data Frame: tobacco
N: 258
| | 18-34 | 35-50 | 51-70 | 71 + |
|---|---|---|---|---|
| Mean | 23.84 | 25.11 | 26.91 | 27.45 |
| Std.Dev | 4.23 | 4.34 | 4.26 | 4.37 |
| Min | 8.83 | 10.35 | 9.01 | 16.36 |
| Median | 24.04 | 25.11 | 26.77 | 27.52 |
| Max | 34.84 | 39.44 | 39.21 | 38.37 |
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr, transpose = TRUE,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown", headings = FALSE)
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| 18-34 | 23.84 | 4.23 | 8.83 | 24.04 | 34.84 |
| 35-50 | 25.11 | 4.34 | 10.35 | 25.11 | 39.44 |
| 51-70 | 26.91 | 4.26 | 9.01 | 26.77 | 39.21 |
| 71 + | 27.45 | 4.37 | 16.36 | 27.52 | 38.37 |
tobacco_subset <- tobacco[ ,c("gender", "age.gr", "smoker")]
freq_tables <- lapply(tobacco_subset, freq)
# view(freq_tables, footnote = NA, file = 'freq-tables.html')
what.is(iris)
$properties
| | property | value |
|---|---|---|
| 1 | class | data.frame |
| 2 | typeof | list |
| 3 | mode | list |
| 4 | storage.mode | list |
| 5 | dim | 150 x 5 |
| 6 | length | 5 |
| 7 | is.object | TRUE |
| 8 | object.type | S3 |
| 9 | object.size | 7256 Bytes |
$attributes.lengths
| names | class | row.names |
|---|---|---|
| 5 | 1 | 150 |
$extensive.is
[1] "is.data.frame" "is.list" "is.object" "is.recursive"
[5] "is.unsorted"
freq(tobacco$gender, style = 'rmarkdown')
## ### Frequencies
## #### tobacco$gender
## **Type:** Factor
##
## | | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
## |-----------:|-----:|--------:|-------------:|--------:|-------------:|
## | **F** | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
## | **M** | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
## | **\<NA\>** | 22 | | | 2.20 | 100.00 |
## | **Total** | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
print(freq(tobacco$gender), method = 'render')
| gender | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
library(skimr)
skim(iris)
# library(ggplot2)
# library(mosaic)
# mPlot(irisdata)
ctable(tobacco$gender, tobacco$smoker, style = 'rmarkdown')
Data Frame: tobacco
| gender / smoker | Yes | No | Total |
|---|---|---|---|
| F | 147 (30.1%) | 342 (69.9%) | 489 (100.0%) |
| M | 143 (29.2%) | 346 (70.8%) | 489 (100.0%) |
| <NA> | 8 (36.4%) | 14 (63.6%) | 22 (100.0%) |
| Total | 298 (29.8%) | 702 (70.2%) | 1000 (100.0%) |
print(ctable(tobacco$gender, tobacco$smoker), method = 'render')
| gender / smoker | Yes | No | Total |
|---|---|---|---|
| F | 147 (30.1%) | 342 (69.9%) | 489 (100.0%) |
| M | 143 (29.2%) | 346 (70.8%) | 489 (100.0%) |
| <NA> | 8 (36.4%) | 14 (63.6%) | 22 (100.0%) |
| Total | 298 (29.8%) | 702 (70.2%) | 1000 (100.0%) |
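The cross-tabulations above report counts with row percentages. A rough base R sketch of the same idea follows, assuming the built-in `mtcars` data as a stand-in because `tobacco` ships with summarytools; the name `tab` is illustrative.

```r
# Counts of cylinders (rows) by transmission type (columns)
tab <- with(mtcars, table(cyl, am))
addmargins(tab)                               # add row and column totals

# Row percentages: each row sums to 100
round(100 * prop.table(tab, margin = 1), 1)
```

Changing `margin = 1` to `margin = 2` gives column percentages instead.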
descr(tobacco, style = 'rmarkdown')
print(descr(tobacco), method = 'render', table.classes = 'st-small')
dfSummary(tobacco, style = 'grid', plain.ascii = FALSE)
print(dfSummary(tobacco, graph.magnif = 0.75), method = 'render')
Here, building up a #ggplot2 as slowly as possible, #rstats. Incremental adjustments. #rstatsteachingideas pic.twitter.com/nUulQl8bPh
— Gina Reynolds (@EvaMaeRey) August 13, 2018
Dreaming of a fancy #Rstats #ggplot #dataviz but still scared of typing #code? @_pvictorr esquisse package has you covered https://t.co/1vIDXcVAAF pic.twitter.com/RlTkptnrNv
— Radoslaw Panczak (@RPanczak) October 2, 2018
library(Rcmdr)
Rcmdr::Commander()
http://r4stats.com/articles/software-reviews/r-commander/
# Save Final Data
Save the data after analysis to `Data-After-Analysis.rds` and `Data-After-Analysis.xlsx`.
saveRDS(mydata, "Data-After-Analysis.rds")
writexl::write_xlsx(mydata, "Data-After-Analysis.xlsx")
file.info("Data-After-Analysis.xlsx")$ctime
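A minimal round-trip sketch for the `.rds` step, assuming `mydata` is any data frame; here it is a small stand-in built from `iris`, and a temporary file is used so nothing is written to the project folder.

```r
mydata <- head(iris)                     # stand-in for the analysed data

rds_path <- tempfile(fileext = ".rds")   # temporary file, removed with the session
saveRDS(mydata, rds_path)                # write a single R object to disk

restored <- readRDS(rds_path)            # read it back under any name
all.equal(mydata, restored)              # check that nothing changed on the way

file.info(rds_path)$mtime                # last modification time of the file
```

Unlike an Excel export, an `.rds` file keeps column types such as factors intact.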
citation()
##
## To cite R in publications use:
##
## R Core Team (2019). R: A language and environment for
## statistical computing. R Foundation for Statistical Computing,
## Vienna, Austria. URL https://www.R-project.org/.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {R: A Language and Environment for Statistical Computing},
## author = {{R Core Team}},
## organization = {R Foundation for Statistical Computing},
## address = {Vienna, Austria},
## year = {2019},
## url = {https://www.R-project.org/},
## }
##
## We have invested a lot of time and effort in creating R, please
## cite it when using it for data analysis. See also
## 'citation("pkgname")' for citing R packages.
citation("tidyverse")
citation("foreign")
citation("tidylog")
citation("janitor")
citation("jmv")
citation("tangram")
citation("finalfit")
citation("summarytools")
citation("ggstatsplot")
citation("readxl")
report::cite_packages(session = sessionInfo())
References
1. Dominic Comtois (2019). summarytools: Tools to Quickly and Neatly Summarize Data. R package version 0.9.3. https://CRAN.R-project.org/package=summarytools
2. Hadley Wickham, Jim Hester and Romain Francois (2018). readr: Read Rectangular Text Data. R package version 1.3.1. https://CRAN.R-project.org/package=readr
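The object returned by `citation()` can also be used programmatically, for example to collect BibTeX entries for a reference manager. A small sketch, assuming only the utils package that ships with R; `r_cite` is an illustrative name.

```r
r_cite <- citation()      # citation for R itself; pass a package name for packages

r_cite$title              # individual fields can be extracted with $
toBibtex(r_cite)          # BibTeX entry, ready to paste into a .bib file
```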
sessionInfo()
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] summarytools_0.9.3 readr_1.3.1
##
## loaded via a namespace (and not attached):
## [1] zoo_1.8-6 tidyselect_0.2.5 report_0.1.0
## [4] xfun_0.8 performance_0.2.0 purrr_0.3.2
## [7] pander_0.6.3 splines_3.6.0 lattice_0.20-38
## [10] parameters_0.1.0 tcltk_3.6.0 vctrs_0.1.99.9000
## [13] htmltools_0.3.6 yaml_2.2.0 survival_2.44-1.1
## [16] rlang_0.4.0.9000 pillar_1.4.1 glue_1.3.1
## [19] estimate_0.1.0 pryr_0.1.4 matrixStats_0.54.0
## [22] emmeans_1.3.5.1 multcomp_1.4-10 plyr_1.8.4
## [25] stringr_1.4.0 bayestestR_0.2.2 mvtnorm_1.0-11
## [28] codetools_0.2-16 coda_0.19-2 evaluate_0.14
## [31] knitr_1.23 TH.data_1.0-10 Rcpp_1.0.1
## [34] xtable_1.8-4 backports_1.1.4 checkmate_1.9.3
## [37] magick_2.0 rapportools_1.0 hms_0.4.2
## [40] digest_0.6.19 stringi_1.4.3 insight_0.3.0
## [43] dplyr_0.8.1 grid_3.6.0 tools_3.6.0
## [46] bitops_1.0-6 sandwich_2.5-1 magrittr_1.5
## [49] RCurl_1.95-4.12 tibble_2.1.3 crayon_1.3.4
## [52] tidyr_0.8.3.9000 pkgconfig_2.0.2 zeallot_0.1.0
## [55] MASS_7.3-51.4 ellipsis_0.2.0.9000 Matrix_1.2-17
## [58] correlation_0.1.0 estimability_1.3 lubridate_1.7.4
## [61] assertthat_0.2.1 rmarkdown_1.13 boot_1.3-22
## [64] R6_2.4.0 compiler_3.6.0
Completed on 2019-06-30 19:37:12.
Serdar Balci, MD, Pathologist
drserdarbalci@gmail.com
https://rpubs.com/sbalci/CV
This is a compilation; I have tried to cite the quoted sources wherever possible.